GCP DevOps Engineer
Description
We are looking for a Senior GCP DevOps Engineer with a deep architectural mindset to design and manage our global cloud footprint. This role isn't just about managing tools; it's about building a resilient, high-availability platform that spans multiple regions and zones, so that our services stay available and responsive for a global user base.
Key Responsibilities
1. High-Availability Architecture
- Multi-Region Strategy: Design and implement resilient architectures across multiple GCP regions and zones to meet a 99.99% uptime target and support robust disaster recovery.
- Traffic Management: Deploy and manage Global Cloud Load Balancing (GCLB) and Cloud DNS to optimize traffic flow and minimize latency.
- Database Reliability: Architect distributed database solutions (e.g., Cloud Spanner, Cloud SQL with cross-region read replicas) to maintain data consistency and availability.
2. Core DevOps & Automation
- CI/CD Leadership: Build and optimize deployment pipelines using Cloud Build, GitLab CI, or GitHub Actions, with a focus on canary and blue-green deployment patterns.
- Infrastructure as Code (IaC): Standardize all infrastructure via Terraform, utilizing modular designs to ensure consistency across dev, staging, and production environments.
- Configuration Management: Manage environment-specific configurations and secrets using Secret Manager and Config Controller.
3. Performance & Scalability
- Fleet Management: Oversee large-scale Google Kubernetes Engine (GKE) clusters, implementing Multi Cluster Ingress and Anthos for cross-region workload orchestration.
- Auto-scaling & Efficiency: Develop custom scaling metrics so that the platform scales out smoothly under peak load and scales back in during idle periods, keeping resource usage efficient.
Required Technical Profile
- Architectural Depth: Extensive experience with GCP network design, including Shared VPC, Cloud Interconnect, and VPC Network Peering.
- Containerization: Mastery of Docker and Kubernetes (GKE), specifically in multi-cluster or multi-region configurations.
- Automation: Expert-level proficiency in Terraform (specifically building reusable modules).
- Resilience Engineering: Proven track record of running chaos engineering experiments or disaster recovery (DR) drills to test system resilience.